OUTLINE OF LECTURES


I. Examples of problems to be considered. A one-dimensional
example for introductory purposes. The reduction of many
approximation problems to an overdetermined system of
linear equations. Possibility of non-uniqueness in minimax
solutions.

II. Theoretical background. Convexity. Linear inequalities.
Characterizing the solution of minimax problems. Solution
of systems involving n + 1 equations in n unknowns.

III. The exchange theorem. An ascent algorithm for the minimax
problem. Formulas for computation. Using the algorithm to
solve linear inequalities.

IV. Rational approximation problem. Existence of best
approximations. Pitfalls. Changing rational functions into
continued fraction form for fast computing. A linear inequality
method for rational approximations. A weighted minimax
algorithm.

V. The differential correction algorithm. Padé approximations.
Examples.


LECTURE  I 


We begin by explaining what is meant by a Tchebycheff approximation.
A simple example, taken from the book "Approximations for Digital
Computers" by Cecil Hastings (Princeton University Press, 1955), is as
follows:

$$\operatorname{Arctan} x \approx c_1 x + c_2 x^3 + c_3 x^5 + c_4 x^7$$

$$c_1 = .9992150, \quad c_2 = -.3211819, \quad c_3 = .1462766, \quad c_4 = -.0389929$$

$$\varepsilon = .00008 \quad \text{for } 0 \le x \le 1$$


The number ε is the maximum discrepancy on the interval [0,1] between
Arctan x and the polynomial approximation. This alone would not justify
the appellation "Tchebycheff approximation". The crucial fact is that the
number ε cannot be improved (decreased) by any adjustments in the
coefficients given above. That is, we have reduced the number

$$\varepsilon = \max_{0 \le x \le 1} \left| \operatorname{Arctan} x - (c_1 x + c_2 x^3 + c_3 x^5 + c_4 x^7) \right|$$

to an absolute minimum by choosing the coefficients $c_1, \ldots, c_4$ as shown.
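The stated value of ε can be checked numerically. The following is a minimal sketch, not part of the original lectures: it evaluates the quoted polynomial on an evenly spaced grid (the grid size is an arbitrary choice) and reports the largest discrepancy found. The fourth coefficient is taken as -.0389929, the reading that reproduces ε = .00008 at x = 1; this reading is an assumption about the printed table.

```python
import math

# Coefficients of c1*x + c2*x^3 + c3*x^5 + c4*x^7 as quoted in the text.
# c4 = -0.0389929 is an assumed reading of the printed value.
COEFFS = (0.9992150, -0.3211819, 0.1462766, -0.0389929)


def hastings_arctan(x):
    """Evaluate the odd polynomial by Horner's rule in x^2."""
    x2 = x * x
    acc = 0.0
    for c in reversed(COEFFS):
        acc = acc * x2 + c
    return acc * x


def max_discrepancy(points=10001):
    """Largest |Arctan x - p(x)| over an evenly spaced grid on [0, 1]."""
    grid = (i / (points - 1) for i in range(points))
    return max(abs(math.atan(x) - hastings_arctan(x)) for x in grid)


if __name__ == "__main__":
    print(f"max discrepancy on the grid: {max_discrepancy():.2e}")
```

A grid check of this kind only bounds the discrepancy from below; it does not by itself show that no other choice of coefficients does better.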

In this brief course we shall develop methods for the numerical determination
of the coefficients in such a Tchebycheff approximation. Our techniques
are not at all restricted to polynomial approximation, however. Their scope
is illustrated by the following typical problems to which they apply.

(1) Find a polynomial P(x) of lowest degree such that on the interval
[0,?], $|P(x) - \sin x| \le 10^{-?} \sin x$.


(2) Find a polynomial in two variables of the form
$P(x,y) = \sum_{i+j \le 4} c_{ij} x^i y^j$
such that $\max_{|x| \le 1} |f(x,y) - P(x,y)|$ is a minimum, f being a
prescribed function.


(3) Find a rational function R(x) of lowest total degree such that on
the interval [0,1], $|R(x) - \operatorname{Arctan} x| \le 10^{?}$.


(4) Given a function f(x) which is known only at certain points $x_1, \ldots, x_m$,
find a polynomial P(x) of degree 5 for which the expression

$$\max_{1 \le i \le m} |f(x_i) - P(x_i)|$$

is an absolute minimum.


(5) Given an overdetermined system of linear equations

$$\sum_{j=1}^{n} a_{ij} x_j = d_i \qquad (i = 1, \ldots, m)$$

find an approximate solution $x = (x_1, \ldots, x_n)$ for which the expression

$$\max_{1 \le i \le m} \left| \sum_{j=1}^{n} a_{ij} x_j - d_i \right|$$

is an absolute minimum.

(6) Continuous functions $f, g_1, \ldots, g_n$ being prescribed, find the
best approximation of f by a linear combination of $g_1, \ldots, g_n$ on an
interval [a,b]. That is, determine coefficients $c_1, \ldots, c_n$ in such a
way that the expression

$$\max_{a \le x \le b} \left| f(x) - \sum_{i=1}^{n} c_i g_i(x) \right|$$

shall be an absolute minimum.


The algorithms which we will develop subsequently are capable of
handling all these problems. The most fundamental of these problems is
number (5), the approximate solution of overdetermined linear equations.
For practical purposes many approximation problems may be put into this
form. For example, suppose we wish to calculate the coefficients in the
approximation cited earlier:

$$\operatorname{Arctan} x \approx c_1 x + c_2 x^3 + c_3 x^5 + c_4 x^7$$

On the interval [0,1] let us take a large number of points $x_1, \ldots, x_m$.
We then wish to determine the coefficients $c_1, \ldots, c_4$ so as to minimize
the expression

$$\max_{1 \le i \le m} \left| \operatorname{Arctan} x_i - (c_1 x_i + c_2 x_i^3 + c_3 x_i^5 + c_4 x_i^7) \right|$$

If we put $a_{ij} = x_i^{2j-1}$ and $d_i = \operatorname{Arctan} x_i$, then we seek to minimize

$$\max_{1 \le i \le m} \left| \sum_{j=1}^{4} a_{ij} c_j - d_i \right|$$

which is an instance of problem (5). One may object that we have
replaced one problem by another and that the solution of the second need
not be close to a solution of the original. However, it is possible to
prove under suitable hypotheses that the approximation taken on a finite
set of points approaches the approximation for the interval, as the finite
set "fills out" the interval. The interested reader may consult my Boeing
Document No. D1-82-0185, "The Relationship between the Tchebycheff
Approximations on an Interval and on a Discrete Subset of that Interval", for a
discussion of this problem.
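The reduction to problem (5) can be sketched as follows. This is an illustration only: the evenly spaced points, the function names, and the two coefficient vectors compared are assumptions made for the example (the fourth quoted coefficient is read as -.0389929). The code builds $a_{ij} = x_i^{2j-1}$ and $d_i = \operatorname{Arctan} x_i$ and evaluates the quantity to be minimized for a given coefficient vector.

```python
import math


def build_system(m):
    """Return (a, d) with a[i][j-1] = x_i^(2j-1), j = 1..4, and d[i] = Arctan x_i.

    The m points x_i are evenly spaced on [0, 1] (an assumed choice).
    """
    xs = [i / (m - 1) for i in range(m)]
    a = [[x ** (2 * j - 1) for j in range(1, 5)] for x in xs]
    d = [math.atan(x) for x in xs]
    return a, d


def delta(a, d, c):
    """max_i | sum_j a_ij c_j - d_i | for the coefficient vector c."""
    return max(abs(sum(aij * cj for aij, cj in zip(row, c)) - di)
               for row, di in zip(a, d))


HASTINGS = (0.9992150, -0.3211819, 0.1462766, -0.0389929)  # quoted coefficients
TAYLOR = (1.0, -1.0 / 3.0, 1.0 / 5.0, -1.0 / 7.0)          # truncated power series

if __name__ == "__main__":
    a, d = build_system(101)
    print("quoted coefficients:", delta(a, d, HASTINGS))
    print("Taylor coefficients:", delta(a, d, TAYLOR))
```

The comparison with the truncated power series shows how much is gained by minimizing the maximum discrepancy instead of matching derivatives at x = 0.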

For reasons set forth above, we are going to consider first the
problem of minimizing an expression

$$\Delta(x) = \max_{1 \le i \le m} \left| \sum_{j=1}^{n} a_{ij} x_j - d_i \right|$$

where the data $a_{ij}$ and $d_i$ are prescribed (real) numbers. In order to
see what to expect, let us examine a simple example, in which n = 1 and
m = 4:

    2x = 1.2
    4x = 2.1
    5x = 2.6
    6x = 3.1

It is clear that any approximate solution to this system of equations should
lie between .5 and .6. We seek to minimize the function

$$\Delta(x) = \max \{ |2x - 1.2|, \; |4x - 2.1|, \; |5x - 2.6|, \; |6x - 3.1| \}.$$

We shall do this graphically in order to gain some insight into the problem.

[Figure: graph of Δ(x) as the upper envelope of the four functions |2x - 1.2|, |4x - 2.1|, |5x - 2.6|, |6x - 3.1|.]

The minimum point of Δ is the solution of the equation 1.2 - 2x = 6x - 3.1
and is therefore x = .5375.
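The value x = .5375 can be confirmed by brute force. The sketch below is an illustration, not the method developed in these lectures: for n = 1 it lists every point at which two residuals are equal in magnitude and keeps the one with the smallest Δ. The fourth equation is taken as 6x = 3.1, as in the expression for Δ(x).

```python
from itertools import combinations

A = [2.0, 4.0, 5.0, 6.0]   # coefficients a_i of the four equations
D = [1.2, 2.1, 2.6, 3.1]   # right-hand sides d_i


def delta(x):
    """Largest residual magnitude max_i |a_i x - d_i|."""
    return max(abs(a * x - d) for a, d in zip(A, D))


def candidates():
    """All x at which two residuals are equal in magnitude."""
    pts = []
    for (a1, d1), (a2, d2) in combinations(list(zip(A, D)), 2):
        if a1 != a2:                 # r_1(x) = r_2(x)
            pts.append((d1 - d2) / (a1 - a2))
        if a1 + a2 != 0:             # r_1(x) = -r_2(x)
            pts.append((d1 + d2) / (a1 + a2))
    return pts


best = min(candidates(), key=delta)

if __name__ == "__main__":
    print(best, delta(best))
```

The number of candidate points grows like m squared here, and like the number of (n + 1)-subsets of the equations in general, which is why a systematic search is needed for larger problems.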

Several features of this example that will persist when n and m
are much greater deserve to be pointed out. First of all, the solution
is obtainable by solving a simple linear equation, but a relatively great
expenditure of effort went into the discovery of this equation. Secondly,
the solution is a point where two of the residual functions, defined by

    r_1(x) = 2x - 1.2
    r_2(x) = 4x - 2.1
    r_3(x) = 5x - 2.6
    r_4(x) = 6x - 3.1

are equal in magnitude. (In the general case the solution will be a point
where n + 1 of the residuals are equal in magnitude.) Finally, the
minimum point of our function is the same as the minimum point of the
simpler function

$$\Delta_1(x) = \max \{ |2x - 1.2|, \; |6x - 3.1| \}.$$

The numerical implications of this observation are quite important. A
large value of m will not make the computations unstable or ill-conditioned;
it will simply involve a higher number of iterations to locate the
appropriate system of n + 1 equations which determines the solution. Those who
are familiar with least-squares computations will realize that this is an
important advantage. The least-squares solution of a matrix equation

$$A x = d$$

is obtained as the exact solution of

$$A^T A x = A^T d.$$

Simply forming $A^T A$ when m is large may involve great round-off
errors. Generally speaking, a Tchebycheff solution of a system of
equations can be calculated with higher precision than a least-squares
solution.

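It should be remarked that a Tchebycheff solution of an overdetermined
system of linear equations need not be unique. For example,
the Tchebycheff solutions of the system

    x = 1
    0x = 1/2
    2x = 1

fill out an interval, namely [1/2, 3/4]: the second equation forces
Δ(x) ≥ 1/2 for every x, and the other two residuals do not exceed 1/2
in magnitude exactly on this interval. In the general case this can occur only
if some n × n submatrices from A are singular.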

Proceeding now to the general problem of minimizing the function

$$\Delta(x) = \max_{1 \le i \le m} \left| \sum_{j=1}^{n} a_{ij} x_j - d_i \right|,$$

we shall prove that Δ cannot have any purely local minima, that is, it
cannot have a graph such as the following.

[Figure]

For, if possible, let x and y be two local minimum points of Δ.
For θ ∈ [0,1] we find that

$$\Delta[\theta x + (1 - \theta) y] = \max_i \left| \sum_j a_{ij} [\theta x_j + (1 - \theta) y_j] - d_i \right|$$

$$\le \theta \max_i \left| \sum_j a_{ij} x_j - d_i \right| + (1 - \theta) \max_i \left| \sum_j a_{ij} y_j - d_i \right|$$

$$= \theta \Delta(x) + (1 - \theta) \Delta(y).$$

(Thus the function Δ is convex.) If Δ(x) < Δ(y), then some points
near y will have lower values of Δ than y. This can be seen by
taking values of θ near 0. On the other hand, if Δ(x) = Δ(y),
then there can be no higher points of the graph between them since
Δ[θx + (1 - θ)y] ≤ Δ(x). A similar proof would apply to the more
complicated situation

[Figure]

The implication of this fact for the problem considered earlier,

$$\operatorname{Arctan} x \approx c_1 x + c_2 x^3 + c_3 x^5 + c_4 x^7,$$

is that, if no "infinitesimal" alteration of the coefficients can decrease
the maximum discrepancy ε, then no "massive" alteration of the coefficients
will do so either.


It should be pointed out that the problem of locating the minimum
point of the function

$$\max_i \left| \sum_j a_{ij} x_j - d_i \right|$$

can be solved by "linear programming". To do so we introduce another
variable ε and ask that it be a minimum under the conditions

$$\sum_j a_{ij} x_j - d_i \le \varepsilon$$

$$-\sum_j a_{ij} x_j + d_i \le \varepsilon \qquad (i = 1, \ldots, m)$$

This is a standard problem of minimizing a linear function with linear
inequality constraints.
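The linear programming form can be written out as data. The following sketch is illustrative only: the names `lp_constraints` and `feasible` are hypothetical, and no linear programming solver is invoked. It arranges the 2m constraints as G z ≤ h in the unknown vector z = (x_1, ..., x_n, ε), the objective being the last component of z, and then checks two trial points for the one-dimensional example treated earlier.

```python
def lp_constraints(a, d):
    """Return (G, h) so that the constraints read G z <= h, z = (x_1..x_n, eps)."""
    G, h = [], []
    for row, di in zip(a, d):
        G.append(list(row) + [-1.0])          #  sum_j a_ij x_j - eps <=  d_i
        h.append(di)
        G.append([-v for v in row] + [-1.0])  # -sum_j a_ij x_j - eps <= -d_i
        h.append(-di)
    return G, h


def feasible(G, h, z, tol=1e-9):
    """True if z satisfies every constraint up to the tolerance tol."""
    return all(sum(g * v for g, v in zip(row, z)) <= hi + tol
               for row, hi in zip(G, h))


if __name__ == "__main__":
    G, h = lp_constraints([[2.0], [4.0], [5.0], [6.0]], [1.2, 2.1, 2.6, 3.1])
    print(feasible(G, h, [0.5375, 0.125]))   # the minimax point with eps = .125
    print(feasible(G, h, [0.5375, 0.12]))    # a smaller eps at the same x
```

The pair (G, h), together with the objective "minimize the last component", is the form in which such a problem is usually handed to a linear programming routine.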
LECTURE II


A basic concept for what follows is that of a convex set. A set
K is convex if

$$x \in K, \quad y \in K, \quad 0 \le \lambda \le 1 \quad \Longrightarrow \quad \lambda x + (1 - \lambda) y \in K.$$

Thus with any two points x, y in K, the line segment joining x
and y lies also in K. Given any set U, another set $\mathcal{H}(U)$ is
determined by specifying that it contain all linear combinations

$$\sum \lambda_i u_i$$

in which $u_i \in U$, $\lambda_i \ge 0$, and $\sum \lambda_i = 1$. The number of summands
in the linear combination is arbitrary (but always finite). The set $\mathcal{H}(U)$
is easily shown to be convex and is called the convex hull of U. The
convex hull of three points, for example, is the triangle having those
points as vertices. An important theorem states that in an n-dimensional
linear space, the sum $\sum \lambda_i u_i$ may be restricted to just n + 1 terms.


Theorem of Carathéodory  In an n-dimensional space every point x
of $\mathcal{H}(U)$ can be written

$$x = \sum_{i=0}^{n} \lambda_i u_i \quad \text{where } u_i \in U, \; \lambda_i \ge 0, \text{ and } \sum \lambda_i = 1.$$


Proof  If $x \in \mathcal{H}(U)$, then $x = \sum_{i=0}^{k} \lambda_i u_i$ with $\lambda_i \ge 0$, $u_i \in U$, and
$\sum \lambda_i = 1$. Let us assume that k is minimal, yet k > n, and try to
obtain a contradiction. Since $\sum_{i=0}^{k} \lambda_i (x - u_i) = 0$ the points $y_i = x - u_i$
are linearly dependent. Since we are assuming k > n, the points
$y_1, \ldots, y_k$ are also linearly dependent, say $\sum_{i=1}^{k} \alpha_i y_i = 0$. Since k was
minimal, $\lambda_i > 0$. Put $\alpha_0 = 0$. Clearly

$$\sum_{i=0}^{k} (\lambda_i + t \alpha_i) y_i = 0$$

for all t. When t = 0, the coefficients are positive. As we increase
|t| the coefficients remain positive for a while but eventually one will
vanish. They don't all vanish because $\lambda_0 + t \alpha_0 > 0$. But if we are
careful to take the first t, at least one coefficient will be 0, while
those that are not 0 are positive. Going back to x and u, we contradict
the minimality of k, inasmuch as $\sum (\lambda_i + t \alpha_i)(x - u_i) = 0$ whence
$x = \sum (\lambda_i + t \alpha_i) u_i / \sum (\lambda_i + t \alpha_i)$. The division at the end makes the
coefficients add up to 1.
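The proof is constructive, and the reduction it describes can be sketched in code. The following is an illustration under stated assumptions: the function names are hypothetical, exact rational arithmetic from the standard `fractions` module replaces real numbers, and the dependence among the points $y_i$ is found by row reduction. Each pass removes at least one point from the combination, as in the argument above.

```python
from fractions import Fraction


def null_vector(vectors):
    """Non-trivial alpha with sum_i alpha_i * vectors[i] = 0.

    Assumes there are more vectors than coordinates, so a dependence exists.
    """
    k, n = len(vectors), len(vectors[0])
    # Row-reduce the n x k matrix whose columns are the given vectors.
    M = [[Fraction(vectors[i][j]) for i in range(k)] for j in range(n)]
    pivots, row = [], 0
    for col in range(k):
        piv = next((r for r in range(row, n) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[row], M[piv] = M[piv], M[row]
        M[row] = [v / M[row][col] for v in M[row]]
        for r in range(n):
            if r != row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
        if row == n:
            break
    free = next(c for c in range(k) if c not in pivots)
    alpha = [Fraction(0)] * k
    alpha[free] = Fraction(1)
    for r, c in enumerate(pivots):
        alpha[c] = -M[r][free]
    return alpha


def caratheodory(points, weights):
    """Rewrite x = sum weights[i] * points[i] using at most n + 1 of the points."""
    pairs = [([Fraction(c) for c in p], Fraction(w))
             for p, w in zip(points, weights) if w != 0]
    pts = [p for p, _ in pairs]
    lam = [w for _, w in pairs]
    n = len(pts[0])
    x = [sum(l * p[j] for l, p in zip(lam, pts)) for j in range(n)]
    while len(pts) > n + 1:
        ys = [[xj - pj for xj, pj in zip(x, p)] for p in pts[1:]]  # y_1 .. y_k
        alpha = [Fraction(0)] + null_vector(ys)                    # alpha_0 = 0
        if all(a >= 0 for a in alpha):
            alpha = [-a for a in alpha]         # make sure some coefficient decreases
        t = min(-l / a for l, a in zip(lam, alpha) if a < 0)       # the first t
        c = [l + t * a for l, a in zip(lam, alpha)]
        total = sum(c)                          # positive, since c_0 = lambda_0 > 0
        kept = [(p, ci / total) for p, ci in zip(pts, c) if ci != 0]
        pts = [p for p, _ in kept]
        lam = [w for _, w in kept]
    return pts, lam
```

For instance, the centre of the unit square, written as the average of the four corners, is reduced by this procedure to a combination of at most three corners.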


Theorem on Linear Inequalities  A system of linear inequalities

$$\sum_{j=1}^{n} a_{ij} x_j > 0 \qquad i = 1, \ldots, m$$

is consistent if and only if $0 \notin \mathcal{H}\{A_1, \ldots, A_m\}$. Here $A_i$
denotes the n-tuple $(a_{i1}, \ldots, a_{in})$.


Proof  If the convex set $K = \mathcal{H}\{A_1, \ldots, A_m\}$ does not contain the origin,
let x be the point of K closest to 0. Then for any i, and for any
$\lambda \in [0,1]$, $\lambda A_i + (1 - \lambda) x \in K$. Consequently

$$0 \le \| \lambda A_i + (1 - \lambda) x \|^2 - \| x \|^2 = \| \lambda (A_i - x) + x \|^2 - \| x \|^2 = \lambda^2 \| A_i - x \|^2 + 2 \lambda \langle A_i - x, x \rangle.$$

But this inequality would be violated for small λ > 0 unless
$\langle A_i - x, x \rangle \ge 0$. Thus $\langle A_i, x \rangle \ge \langle x, x \rangle > 0$.
This shows that x is a solution of the system of inequalities. For the
converse, assume 0 ∈ K. Then $0 = \sum_{i=1}^{m} \lambda_i A_i$ with $\lambda_i \ge 0$ and $\sum \lambda_i = 1$.
Thus $0 = \langle 0, x \rangle = \sum \lambda_i \langle A_i, x \rangle$. This cannot be true if all
$\langle A_i, x \rangle > 0$. The notation $\langle u, v \rangle$ denotes $\sum_{j=1}^{n} u_j v_j$.

Now we return to the problem of minimizing the function

$$\Delta(x) = \max_{1 \le i \le m} \left| \sum_{j=1}^{n} a_{ij} x_j - d_i \right|.$$


How will we know when a point $x = (x_1, \ldots, x_n)$ is a solution? It must
be impossible to decrease Δ by moving slightly in any direction from x.
Among the residuals

$$r_i(x) = \sum_{j=1}^{n} a_{ij} x_j - d_i \qquad i = 1, \ldots, m$$

let us single out those which are maximum in absolute value. By renumbering
the original equations we could assume

$$|r_1(x)| = |r_2(x)| = \cdots = |r_k(x)| > |r_{k+\nu}(x)| \qquad (\nu = 1, 2, \ldots).$$

Thus k is the number of residuals which are equal in absolute value to
Δ(x), and we have assumed that these are the first k. Now if we are
to decrease Δ(x) by changing x, we will have to decrease all the numbers

$$|r_1(x)|, \; |r_2(x)|, \; \ldots, \; |r_k(x)|.$$

The remaining residuals which are less can be ignored if only a slight
change is contemplated in x. Suppose that we move from x in the
direction z. How are the residuals affected? A computation shows:

$$r_i(x + \lambda z) = \sum_j a_{ij} (x_j + \lambda z_j) - d_i = \sum_j a_{ij} x_j - d_i + \lambda \sum_j a_{ij} z_j = r_i(x) + \lambda \langle A_i, z \rangle.$$

Thus $r_i$ increases when $\langle A_i, z \rangle$ is positive; it decreases if
$\langle A_i, z \rangle$ is negative; and it remains constant if $\langle A_i, z \rangle = 0$.
In order to decrease $|r_i(x)|$ by moving in direction z, then, $\langle A_i, z \rangle$
should be of opposite sign to $r_i(x)$. Let us define $\sigma_i = \operatorname{sgn} r_i(x)$,
so that $\sigma_i = +1, 0, -1$ according as $r_i(x) > 0, = 0, < 0$. The direction
z that we are seeking must then have the property

$$\sigma_1 \langle A_1, z \rangle < 0, \quad \sigma_2 \langle A_2, z \rangle < 0, \quad \ldots, \quad \sigma_k \langle A_k, z \rangle < 0.$$

In other words, z must be a solution of the system of linear inequalities

$$\langle \sigma_i A_i, z \rangle < 0 \qquad (i = 1, \ldots, k).$$

At any point x, we can define such a system of inequalities by singling
out the residuals which are maximum in absolute value and letting $\sigma_i$
denote the sign of these residuals. If this system of linear inequalities
is consistent, then x is not a solution, for a slight displacement of
x in an appropriate direction z will decrease Δ(x). If the system
of linear inequalities is inconsistent, then there is no direction in which
all the maximum residuals decrease and hence x is a solution.


Let us assume as a non-degeneracy hypothesis that every set of n
vectors selected from the set $\{A_1, \ldots, A_m\}$ is independent. An
equivalent hypothesis is that every n × n submatrix from $A = (a_{ij})$ is
non-singular. Under this hypothesis we can prove that at the solution
point x there must be at least n + 1 maximum residuals in absolute
value. Suppose that x is a point at which there are n or fewer
maximum residuals, say $r_1, \ldots, r_k$ where k ≤ n. Then we can solve
the system of equations

$$\langle A_i, z \rangle = -\sigma_i \qquad i = 1, \ldots, k$$

because of our non-degeneracy assumption, and the resulting direction z
will be one in which Δ decreases, because $\langle \sigma_i A_i, z \rangle = -\sigma_i^2 < 0$.
(Recall that in the simple example with n = 1 there were two equal
maximum residuals in absolute value at the minimum point.) In the above
argument the case $\sigma_i = 0$ does not arise unless all $r_i = 0$, when there
will be m equal maximum residuals.

In preceding theorems, we have shown that, if the system of linear
inequalities

$$\langle \sigma_i A_i, z \rangle < 0 \qquad i = 1, \ldots, k$$

is to be inconsistent, then $0 \in \mathcal{H}\{\sigma_1 A_1, \sigma_2 A_2, \ldots, \sigma_k A_k\}$. We have also shown
that 0 must lie in the convex hull of no more than n + 1 of the points
$\sigma_1 A_1, \ldots, \sigma_k A_k$. We have therefore proved the following theorem.


Theorem  A minimax solution of a system of m equations in
n unknowns, where m > n, is a minimax solution of a certain
subsystem comprising n + 1 equations.
The next thing to do then is to see how to compute the minimax
solution of a system of n + 1 equations in n unknowns. For
convenience in notation let the system be written

$$\langle A_i, x \rangle = d_i \qquad (i = 0, \ldots, n).$$

Suppose that by some means we are able to obtain a point $x = (x_1, \ldots, x_n)$,
a number ε, and signs $\sigma_i$ (= -1 or +1) in such a way that the two
following conditions are satisfied.

(1)  $\langle A_i, x \rangle - d_i = \sigma_i \varepsilon \qquad (i = 0, \ldots, n)$

(2)  $0 \in \mathcal{H}\{\sigma_0 A_0, \sigma_1 A_1, \ldots, \sigma_n A_n\}$

Then we would be finished, for by (1), all n + 1 residuals are equal to
|ε| in magnitude, and by (2), no reduction in the number

$$\Delta(x) = \max_{0 \le i \le n} | \langle A_i, x \rangle - d_i |$$

is possible. Probably the easiest way to obtain x, ε, and $\sigma_i$ is
as follows.

First, obtain any non-trivial solution of the homogeneous equations

$$\sum_{i=0}^{n} \lambda_i A_i = 0.$$

Second, define $\sigma_i = \operatorname{sgn} \lambda_i$ (i = 0, ..., n). If any $\lambda_i$ vanishes,
then the non-degeneracy assumption has been violated. Since $\sum \lambda_i A_i = 0$
and $\lambda_i \sigma_i > 0$, we have already secured condition (2) above.

Third, put $\varepsilon = -\sum \lambda_i d_i / \sum \lambda_i \sigma_i$.

Fourth, solve for x in the system

$$\langle A_i, x \rangle = d_i + \sigma_i \varepsilon \qquad (i = 1, \ldots, n).$$

This system is solvable because it is n equations in n unknowns,
with a non-singular matrix. We almost have fulfilled condition (1).
We must merely check to verify that

$$\langle A_0, x \rangle = d_0 + \sigma_0 \varepsilon.$$

To do this compute as follows:

$$\sum_{i=0}^{n} \lambda_i r_i(x) = \sum_{i=0}^{n} \lambda_i [\langle A_i, x \rangle - d_i] = \Big\langle \sum_{i=0}^{n} \lambda_i A_i, \, x \Big\rangle - \sum_{i=0}^{n} \lambda_i d_i$$

$$= -\sum_{i=0}^{n} \lambda_i d_i = \varepsilon \sum_{i=0}^{n} \lambda_i \sigma_i = \varepsilon \lambda_0 \sigma_0 + \sum_{i=1}^{n} \lambda_i \varepsilon \sigma_i = \varepsilon \lambda_0 \sigma_0 + \sum_{i=1}^{n} \lambda_i r_i(x).$$

Cancelling, we get

$$\lambda_0 r_0(x) = \varepsilon \lambda_0 \sigma_0$$

which was to be proved.


Remark 1  If the signs $\sigma_i$ were known, then we could
treat condition (1) as a system of n + 1 linear equations
in the unknowns $x_1, \ldots, x_n$, ε and be done immediately.
This is usually possible in polynomial approximation problems.

Remark 2  If we assume only that some set of n vectors
from $\{A_0, \ldots, A_n\}$ is independent (rather than all sets),
then the only change necessary in the above algorithm is in
solving for x, where we would have to select a set of n
rows with a non-singular matrix in order for Gaussian
elimination to work smoothly.
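The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the author's program: the function names are hypothetical, exact rational arithmetic is used in place of floating point, the rows $A_1, \ldots, A_n$ are assumed independent (so that $\lambda_0 = 1$ may be prescribed in the first step), and a singular system makes the helper `solve` fail.

```python
from fractions import Fraction


def solve(M, b):
    """Solve the square system M y = b by Gauss-Jordan elimination (exact)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular.
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def sgn(v):
    return (v > 0) - (v < 0)


def minimax_n_plus_1(A, d):
    """Minimax solution of <A_i, x> = d_i, i = 0..n, with n = len(A) - 1.

    Returns (x, eps, sigma) with <A_i, x> - d_i = sigma_i * eps for every i.
    """
    n = len(A) - 1
    d = [Fraction(v) for v in d]
    # First: lambda_0 = 1 and sum_{i>=1} lambda_i A_i = -A_0.
    At = [[A[i][j] for i in range(1, n + 1)] for j in range(n)]
    lam = [Fraction(1)] + solve(At, [-A[0][j] for j in range(n)])
    # Second: the signs.
    sigma = [sgn(l) for l in lam]
    if 0 in sigma:
        raise ValueError("non-degeneracy assumption violated")
    # Third: eps = -sum lambda_i d_i / sum lambda_i sigma_i.
    eps = -sum(l * di for l, di in zip(lam, d)) / sum(l * s for l, s in zip(lam, sigma))
    # Fourth: solve <A_i, x> = d_i + sigma_i eps for i = 1..n.
    x = solve(A[1:], [d[i] + sigma[i] * eps for i in range(1, n + 1)])
    return x, eps, sigma
```

Applied to the two equations 2x = 1.2 and 6x = 3.1 of the earlier example, the sketch returns x = 43/80 = .5375 and ε = -1/8, in agreement with the graphical solution.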
LECTURE III


We now return to the more general problem of m equations in
n unknowns. The minimax solution of the whole system is also the
minimax solution of an (n + 1)-subsystem, and we must systematize
the search for this subsystem. Let us start by calculating the
minimax solution of any subsystem

$$\langle A_{i_j}, x \rangle = d_{i_j} \qquad (j = 0, \ldots, n),$$

where the indices $i_0, i_1, \ldots, i_n$ are any n + 1 taken from the set
{1, 2, ..., m}. Having the point x, we now compute all residuals

$$r_i(x) = \langle A_i, x \rangle - d_i.$$

Of course, n + 1 of these should be equal in magnitude, but unless we
are very lucky they will not be the maximum ones. If $|r_{i_0}(x)|, \ldots, |r_{i_n}(x)|$
were the maximum residuals, then x would be the minimax solution of
the entire system. So we select an index α such that

$$|r_\alpha(x)| = \Delta(x) = \max_{1 \le i \le m} |r_i(x)|$$

and we are now going to replace one of the indices $i_0, \ldots, i_n$ by α
and repeat the entire process. Strangely enough, the index $i_j$ which
is to be replaced is uniquely determined by the condition that

$$0 \in \mathcal{H}\{\sigma_{i_0} A_{i_0}, \ldots, \sigma_{i_{j-1}} A_{i_{j-1}}, \sigma_{i_{j+1}} A_{i_{j+1}}, \ldots, \sigma_{i_n} A_{i_n}, \sigma_\alpha A_\alpha\}.$$

Here as usual we have put $\sigma_i = \operatorname{sgn} r_i(x)$. The formal statement of
this fact is known as the exchange theorem.


Exchange Theorem  In $E_n$ let $\{A_0, \ldots, A_{n+1}\}$ be a set
of vectors of which every set of n is independent. Suppose
that 0 is in the convex hull of $\{A_0, \ldots, A_n\}$. Then there
is a unique index β such that 0 is in the convex hull of
$\{A_0, \ldots, A_{\beta-1}, A_{\beta+1}, \ldots, A_{n+1}\}$.


Proof  By hypothesis there exist constants $\theta_i \ge 0$ such that $0 = \sum_{i=0}^{n} \theta_i A_i$
and $\sum \theta_i = 1$. Since every set of n vectors is independent, $\theta_i > 0$.
We may express $A_{n+1}$ as a linear combination $A_{n+1} = \sum_{i=0}^{n} \mu_i A_i$, and all
possible expressions of 0 as a linear combination of $A_0, \ldots, A_{n+1}$ are
encompassed by the equation

$$s \Big[ A_{n+1} - \sum_{i=0}^{n} (\mu_i - t \theta_i) A_i \Big] = 0,$$

where t and s are real variables. If s > 0, then for large t the
coefficients $-\mu_i + t \theta_i$ are all positive, and for an appropriate value
of t, all of these coefficients are positive save one which vanishes.
(Specifically, we take t equal to the largest ratio $\mu_i / \theta_i$.) No more
than 1 coefficient vanishes at a time for otherwise there is a
contradiction of the hypothesis concerning independence. The index of the
vanishing coefficient is therefore uniquely determined if we require that
the remaining coefficients be positive.

This theorem is due to E. Stiefel, "Numerical Methods of Tchebycheff
Approximation", pp. 217-231 in the book "On Numerical Approximation", R.
Langer, editor, Madison, 1959. The reader should consult also E. Stiefel,
"Note on Jordan Elimination, Linear Programming, and Tchebycheff
Approximation", Numerische Mathematik 2 (1960) 1-17.


In broad outline our algorithm is this:


1. Given any set of n + 1 indices, $I = \{i_0, \ldots, i_n\}$, calculate the
Tchebycheff solution of the n + 1 equations $\sum_j a_{i_k j} x_j = d_{i_k}$
(k = 0, ..., n). Let the point obtained be $x = (x_1, \ldots, x_n)$.

2. Let α be an index for which $|r_\alpha(x)| = \Delta(x)$.

3. Perform an "exchange" of α with an appropriate index from I.
Having this new set of indices, I, return to step 1.


In detail the algorithm is as follows.


1. Select any set of indices $I = \{i_0, \ldots, i_n\}$. By Gaussian elimination,
calculate any non-trivial solution $(\lambda_0, \ldots, \lambda_n)$ of the homogeneous
equations $\sum_{k=0}^{n} a_{i_k j} \lambda_k = 0$ (j = 1, ..., n). Define $\sigma_k = \operatorname{sgn} \lambda_k$
(k = 0, ..., n). If any $\lambda_k = 0$, note this fact. Define

$$\varepsilon = -\sum_{k=0}^{n} \lambda_k d_{i_k} \Big/ \sum_{k=0}^{n} |\lambda_k|.$$

By Gaussian elimination, calculate the solution $x = (x_1, \ldots, x_n)$ of
the equations $\sum_{j=1}^{n} a_{i_k j} x_j = d_{i_k} + \sigma_k \varepsilon$ (k = 1, ..., n). Test to
see whether $\sum_{j=1}^{n} a_{i_0 j} x_j - d_{i_0} - \sigma_0 \varepsilon = 0$. Record this number.


2. Calculate the numbers $r_i(x) = \sum_{j=1}^{n} a_{ij} x_j - d_i$ (i = 1, ..., m).
Select α so that $|r_\alpha(x)|$ is a maximum. Put $\sigma_\alpha = \operatorname{sgn} r_\alpha(x)$.
If $|r_\alpha(x)| \le |\varepsilon|$, stop, for x is the Tchebycheff solution.


3. By Gaussian elimination, calculate the solution $\mu = (\mu_1, \ldots, \mu_n)$
of the equations $\sum_{k=1}^{n} \mu_k \sigma_k a_{i_k j} = \sigma_\alpha a_{\alpha j}$ (j = 1, ..., n). Define
$\mu_0 = 0$. Let β be the index of the largest ratio $\mu_k / |\lambda_k|$. This is the
ratio $\mu_k / \theta_k$ of the exchange theorem, with $\theta_k$ proportional to $|\lambda_k|$, and it
presumes that each $\sigma_k$ is the sign of the residual $r_{i_k}(x)$, that is, that
ε ≥ 0. (If ε < 0, first change the signs of all the $\lambda_k$, which changes the
sign of ε and leaves x unaltered.)
Replace $i_\beta$ by α, and return to step 1 with the new set of indices I.


There are some methods of streamlining the above computations to
save a little computer time. But I believe the above algorithm to be
superior in maintaining accuracy throughout the calculation.
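The detailed algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the program referred to in the text: exact rational arithmetic is used, indices are counted from 0, the starting set I defaults to the first n + 1 equations, the non-degeneracy hypothesis is assumed, and the multipliers $\lambda_k$ are normalized so that ε ≥ 0 before the exchange (an added convention that makes each $\sigma_k$ the actual sign of the residual). The exchange index is taken as the largest ratio $\mu_k / |\lambda_k|$, following the proof of the exchange theorem.

```python
from fractions import Fraction


def solve(M, b):
    """Solve the square system M y = b by Gauss-Jordan elimination (exact)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)  # fails if singular
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def sgn(v):
    return (v > 0) - (v < 0)


def exchange_minimax(A, d, I=None, max_iter=100):
    """Return (x, deviation, I) for the system sum_j A[i][j] x_j = d[i]."""
    m, n = len(A), len(A[0])
    A = [[Fraction(v) for v in row] for row in A]
    d = [Fraction(v) for v in d]
    I = list(I) if I is not None else list(range(n + 1))
    for _ in range(max_iter):
        # Step 1: Tchebycheff solution of the n + 1 equations indexed by I.
        At = [[A[I[k]][j] for k in range(1, n + 1)] for j in range(n)]
        lam = [Fraction(1)] + solve(At, [-A[I[0]][j] for j in range(n)])
        if any(l == 0 for l in lam):
            raise ValueError("non-degeneracy assumption violated")
        eps = -sum(l * d[i] for l, i in zip(lam, I)) / sum(abs(l) for l in lam)
        if eps < 0:                       # normalize so that eps >= 0
            lam, eps = [-l for l in lam], -eps
        sigma = [sgn(l) for l in lam]
        x = solve([A[I[k]] for k in range(1, n + 1)],
                  [d[I[k]] + sigma[k] * eps for k in range(1, n + 1)])
        # Step 2: all residuals; stop if none exceeds eps in magnitude.
        r = [sum(a * xj for a, xj in zip(row, x)) - di for row, di in zip(A, d)]
        alpha = max(range(m), key=lambda i: abs(r[i]))
        if abs(r[alpha]) <= eps:
            return x, eps, I
        s_alpha = sgn(r[alpha])
        # Step 3: sum_{k>=1} mu_k sigma_k A_{i_k} = s_alpha A_alpha, mu_0 = 0.
        Bt = [[sigma[k] * A[I[k]][j] for k in range(1, n + 1)] for j in range(n)]
        mu = [Fraction(0)] + solve(Bt, [s_alpha * A[alpha][j] for j in range(n)])
        beta = max(range(n + 1), key=lambda k: mu[k] / abs(lam[k]))
        I[beta] = alpha
    raise RuntimeError("no convergence within max_iter exchanges")
```

On the four equations 2x = 1.2, 4x = 2.1, 5x = 2.6, 6x = 3.1, starting from the first two, the sketch performs one exchange and stops at x = 43/80 with deviation 1/8.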


It is an interesting fact that any algorithm for solving
overdetermined systems of linear equations can be used without modification
to solve another type of problem, viz. solving systems of linear
inequalities. A system of linear inequalities looks typically like this:

$$\sum_{j=1}^{n} a_{ij} x_j \le d_i \qquad (i = 1, \ldots, m)$$

We have here a system of m inequalities in n unknowns, and we seek a
solution x, if one exists. If one exists, the system is said to be
consistent. We have the following theorem:


Theorem  If the system of linear inequalities above is consistent,
then for sufficiently large constants Q, every Tchebycheff
solution of

$$\sum_{j=1}^{n} a_{ij} x_j = d_i - Q \qquad (i = 1, \ldots, m) \qquad (2)$$

is a solution of the inequalities.

Proof  Let y be any solution of the inequalities. Let
$Q \ge \max_i (d_i - \sum_j a_{ij} y_j)$. Let x be a Tchebycheff solution of (2).
Then

$$\max_i \Big( \sum_j a_{ij} x_j - d_i + Q \Big) \le \max_i \Big| \sum_j a_{ij} x_j - d_i + Q \Big| \le \max_i \Big| \sum_j a_{ij} y_j - d_i + Q \Big| = \max_i \Big( \sum_j a_{ij} y_j - d_i + Q \Big) \le Q,$$

whence $\sum_j a_{ij} x_j \le d_i$. A similar result for least-squares solutions
is not true.
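The theorem can be tried on a small case. The sketch below is an illustration only: it treats a single unknown, finds the Tchebycheff solution of the shifted equations by brute force over the points where two residuals are equal in magnitude (not by the ascent algorithm), and uses a hypothetical set of three inequalities, x ≤ 3, -x ≤ -1, 2x ≤ 5, with Q = 2.

```python
from itertools import combinations


def minimax_1d(a, d):
    """Tchebycheff solution of a_i x = d_i (one unknown) by brute force."""
    def delta(x):
        return max(abs(ai * x - di) for ai, di in zip(a, d))

    cands = []
    for (a1, d1), (a2, d2) in combinations(list(zip(a, d)), 2):
        if a1 != a2:                 # residuals equal
            cands.append((d1 - d2) / (a1 - a2))
        if a1 + a2 != 0:             # residuals equal and opposite
            cands.append((d1 + d2) / (a1 + a2))
    return min(cands, key=delta)


def solve_inequalities_1d(a, d, Q):
    """Solve a_i x <= d_i via the Tchebycheff solution of a_i x = d_i - Q."""
    return minimax_1d(a, [di - Q for di in d])


if __name__ == "__main__":
    a, d = [1, -1, 2], [3, -1, 5]
    x = solve_inequalities_1d(a, d, Q=2)
    print(x, all(ai * x <= di for ai, di in zip(a, d)))
```

Here y = 2 satisfies the inequalities with d_i - a_i y = 1 for each i, so any Q ≥ 1 meets the hypothesis of the proof.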
LECTURE IV


By the "rational approximation problem" we mean the following.
Let a continuous function f defined on an interval [a,b] be given.
Let n and m be two prescribed integers. Determine then the optimum
coefficients $p_0, \ldots, p_n$, $q_0, \ldots, q_m$ in the approximation

$$f(x) \approx \frac{p_0 + p_1 x + \cdots + p_n x^n}{q_0 + q_1 x + q_2 x^2 + \cdots + q_m x^m}.$$

As usual we would like a Tchebycheff solution, and thus seek to minimize
the expression

$$\Delta = \max_{a \le x \le b} \left| f(x) - \frac{P(x)}{Q(x)} \right|$$

where P and Q are respectively the numerator and denominator of the
rational function. Of course, there is a discrete analogue of this problem
in which we select a finite set of points from the interval [a,b].


The existence of the optimum coefficients is guaranteed by a theorem
in Achieser, Theory of Approximation, Ungar, 1956, p. 53. (The proof has
a flaw but is essentially correct.) It is necessary to emphasize that the
existence is provable for best rational approximations on an interval but
not for a discrete subset! For example, suppose f is a continuous
function defined on [0,2] such that f(0) = 1, f(1) = 0, and f(2) = 0.
Let us attempt to approximate f by a rational function of the form
a/(bx + c) at just the points 0, 1, and 2. Now the deviation of the
function ε/(x + ε) from f on the three given points is ε/(1 + ε),
and this can be made arbitrarily small by making ε small. Thus

$$\inf_{a,b,c} \; \max_{x = 0,1,2} \left| f(x) - \frac{a}{bx + c} \right| = 0.$$

Yet no choice of a, b, and c will achieve for us this minimum. This
example is due to P. C. Curtis, Jr. Another phenomenon that may occur
in the discrete problem is that after obtaining a good rational
approximation for the discrete point set, one may discover that the denominator
vanishes at intermediate points, so rendering the approximation useless.
For example, let us try to approximate f(x) = x on [-1,1] by a/(bx + c).
On the subset

$$X_n = \{ x : \tfrac{1}{n} \le |x| \le 1 \}$$

the best approximation of the stated form is [expression illegible], but this has a pole
at x = 0. This remains true as n → ∞. This example is also due to
Curtis. Thus certain pitfalls await the unwary in this field.
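The first example can be checked directly. The sketch below is only an illustration (the function name `deviation` is hypothetical): it evaluates the largest deviation of a/(bx + c) from f at the points 0, 1, 2 and shows it shrinking with ε for the choice a = ε, b = 1, c = ε.

```python
# Prescribed values f(0) = 1, f(1) = 0, f(2) = 0 from the example.
F = {0: 1.0, 1: 0.0, 2: 0.0}


def deviation(a, b, c):
    """max over x = 0, 1, 2 of |f(x) - a / (b x + c)|."""
    return max(abs(F[x] - a / (b * x + c)) for x in F)


if __name__ == "__main__":
    for eps in (1e-1, 1e-3, 1e-6):
        # For a = eps, b = 1, c = eps the deviation is eps / (1 + eps).
        print(eps, deviation(eps, 1.0, eps))
```

A deviation of exactly zero would require a/c = 1 and a/(b + c) = 0 at once, which is impossible; this is why the infimum is not attained.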

One  may  ask  whether,  in  view  of  these  drawbacks,  rational  approxi¬ 
mations  are  worth  our  study.  After  all,  polynomials  are  sufficient  for 
the  approximation  of  any  continuous  function  -  so  says  the  Weierstrass 
Theorem.  Nevertheless,  a  polynomial  approximation  to  a  given  continuous 
function  may  fail  to  have  the  required  accuracy  unless  its  degree  becomes 
very  large.  In  these  cases  a  rational  approximation  will  sometimes  pro¬ 
vide  a  spectacular  improvement.  From  the  standpoint  of  economizing  the 
computing  time,  rational  functions  are  also  recommended.  The  reason  for 
this  is  that  a  rational  function  cein  always  be  converted  into  an  equi¬ 
valent  continued  fraction  for  fast  computing.  To  illustrate,  consider  the 


rational function

$$R(x) = \frac{x^4 + 2x^3 - 2x^2 - 2x + 4}{x^3 + 2x^2 + 2x + 6} .$$

It would appear that eight "long operations" (multiplications and divisions) would be necessary to compute $R(x)$. However, if we perform the long division indicated, we get

$$R(x) = x + \frac{-4x^2 - 8x + 4}{x^3 + 2x^2 + 2x + 6} .$$

Now write this as

$$R(x) = x - \cfrac{4}{\dfrac{x^3 + 2x^2 + 2x + 6}{x^2 + 2x - 1}} .$$

We again perform the indicated long division to obtain

$$R(x) = x - \cfrac{4}{x + \dfrac{3x + 6}{x^2 + 2x - 1}} = x - \cfrac{4}{x + \cfrac{3}{\dfrac{x^2 + 2x - 1}{x + 2}}} .$$

Continuing in this way we obtain finally

$$R(x) = x - \cfrac{4}{x + \cfrac{3}{x - \cfrac{1}{x + 2}}}$$

which obviously entails only 3 long operations. Nevertheless, the rational function that we started with has roughly the same curve-fitting ability as a polynomial of degree seven or eight.
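The reduction can be checked numerically. The following is a minimal sketch in Python (the function names `r_quotient` and `r_continued_fraction` are ours, chosen only for illustration) which evaluates the rational function $R(x) = (x^4 + 2x^3 - 2x^2 - 2x + 4)/(x^3 + 2x^2 + 2x + 6)$ both as a quotient of polynomials and as the continued fraction $x - 4/(x + 3/(x - 1/(x + 2)))$.

```python
def r_quotient(x):
    """R(x) as a quotient of two polynomials, each evaluated by Horner's rule."""
    num = (((x + 2.0) * x - 2.0) * x - 2.0) * x + 4.0
    den = ((x + 2.0) * x + 2.0) * x + 6.0
    return num / den


def r_continued_fraction(x):
    """R(x) = x - 4/(x + 3/(x - 1/(x + 2))): three divisions, no multiplications."""
    return x - 4.0 / (x + 3.0 / (x - 1.0 / (x + 2.0)))


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.5):
        print(x, r_quotient(x), r_continued_fraction(x))
```

The two forms agree wherever no intermediate denominator of the continued fraction vanishes; at such isolated points (for instance where $x^2 + 2x - 1 = 0$) the continued fraction needs separate treatment even though $R(x)$ itself is finite.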


One of the simplest algorithms for obtaining rational approximations depends on having at hand a program for solving linear inequalities. (In this connection see remarks made in lecture 3.) Suppose we wish an approximation of the form

$$f(x) \approx \frac{\sum_{j=1}^{n} c_j x^{j-1}}{1 + \sum_{j=n+1}^{N} c_j x^{j-n}} .$$


Let us require that the approximation shall deviate from $f$ no more than an amount $\epsilon$ at a large number of points $x_1, \ldots, x_m$. Let us require also that the denominator remain $\ge \delta > 0$ at these $m$ points. Our requirements are then

$$\left| f(x_i) - \frac{\sum_{j=1}^{n} c_j x_i^{j-1}}{1 + \sum_{j=n+1}^{N} c_j x_i^{j-n}} \right| \le \epsilon , \qquad 1 + \sum_{j=n+1}^{N} c_j x_i^{j-n} \ge \delta \qquad (i = 1, \ldots, m) .$$
This becomes a system of linear inequalities if we replace an inequality $|y| \le \epsilon$ by two inequalities $y \le \epsilon$, $-y \le \epsilon$, and then clear of fractions. The result is

$$\begin{aligned}
{[f(x_i) - \epsilon]} \sum_{j=n+1}^{N} c_j x_i^{j-n} - \sum_{j=1}^{n} c_j x_i^{j-1} &\le \epsilon - f(x_i) \\
{[-f(x_i) - \epsilon]} \sum_{j=n+1}^{N} c_j x_i^{j-n} + \sum_{j=1}^{n} c_j x_i^{j-1} &\le \epsilon + f(x_i) \\
- \sum_{j=n+1}^{N} c_j x_i^{j-n} &\le 1 - \delta
\end{aligned}$$

If we select too small an $\epsilon$ this system will have no solution $(c_1, \ldots, c_N)$. But by trial and error, we can discover an appropriate value of $\epsilon$ and a satisfactory rational approximation.
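To make the bookkeeping concrete, here is a minimal sketch in Python of how the three inequalities per point might be assembled as rows of a system $Ac \le b$, which would then be handed to whatever linear-inequality program is available. The names `inequality_rows` and `satisfies` are hypothetical, and no solver is included; the sketch only builds the rows and tests a candidate $c$ against them.

```python
def inequality_rows(f_vals, xs, n, N, eps, delta):
    """Return (A, b) such that the requirements read A c <= b, c = (c_1..c_N)."""
    A, b = [], []
    for fx, x in zip(f_vals, xs):
        num = [x ** (j - 1) for j in range(1, n + 1)]       # x^(j-1), j = 1..n
        den = [x ** (j - n) for j in range(n + 1, N + 1)]   # x^(j-n), j = n+1..N
        # [f - eps] * sum(den) - sum(num) <= eps - f
        A.append([-t for t in num] + [(fx - eps) * t for t in den])
        b.append(eps - fx)
        # [-f - eps] * sum(den) + sum(num) <= eps + f
        A.append(list(num) + [(-fx - eps) * t for t in den])
        b.append(eps + fx)
        # -sum(den) <= 1 - delta   (denominator stays >= delta)
        A.append([0.0] * n + [-t for t in den])
        b.append(1.0 - delta)
    return A, b


def satisfies(A, b, c, tol=1e-12):
    """True if the candidate coefficient vector c satisfies every row."""
    return all(sum(a * ci for a, ci in zip(row, c)) <= bi + tol
               for row, bi in zip(A, b))


if __name__ == "__main__":
    xs = [i / 10.0 for i in range(11)]
    f_vals = [1.0 / (1.0 + x) for x in xs]      # exactly of the form c_1/(1 + c_2 x)
    A, b = inequality_rows(f_vals, xs, n=1, N=2, eps=1e-3, delta=0.5)
    print(satisfies(A, b, [1.0, 1.0]), satisfies(A, b, [0.0, 0.0]))
```

The test function $1/(1+x)$ is chosen only because its exact coefficients $c_1 = c_2 = 1$ are known in advance, so the rows can be checked by hand.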


Another simple and very effective method which can be used has been termed the Weighted Minimax Algorithm. It requires that one already possess a program for solving overdetermined systems of linear equations in the minimax or Tchebycheff sense. This algorithm suffers from the disadvantage that a proof of its effectiveness is still lacking. Nevertheless, the ease with which it may be programmed and the success which it has had recommend it highly. To explain it, suppose again that we seek an approximation

$$f(x) \approx \frac{P(x)}{Q(x)}$$


where $P$ and $Q$ are polynomials of prescribed degrees whose coefficients we wish to determine. We seek, of course, to render the following expression an absolute minimum:

$$\max_{a \le x \le b} \left| f(x) - \frac{P(x)}{Q(x)} \right| .$$

This can be written

$$\max_{a \le x \le b} \left| \frac{1}{Q(x)} \right| \cdot \left| f(x) Q(x) - P(x) \right| .$$

Now in this algorithm we consider $|1/Q(x)|$ to be a weight function, $w(x)$. In the first step we take $w(x) = 1$, and attempt to minimize

$$\max_{a \le x \le b} \left| f(x) Q(x) - P(x) \right| .$$

To avoid trivial solutions, we may fix a term in $Q(x)$, say by requiring $Q(0) = 1$. Then we compute a new weight function $w(x)$ and repeat the process. Formally, we have the following iterative algorithm.

1. Using $w_i = 1$, minimize the expression $\max_{1 \le i \le m} w_i \, |f(x_i) Q(x_i) - P(x_i)|$ by use of the program for overdetermined linear equations. Call the solution $Q_0$ and $P_0$.

2. Set $w_i = 1/|Q_0(x_i)|$ and minimize the expression $\max_i w_i \, |f(x_i) Q(x_i) - P(x_i)|$. Call the solution $Q_1$ and $P_1$.

3. Set $w_i = 1/|Q_1(x_i)|$ and continue in this way.

Experience has shown that using double precision arithmetic and about 10 steps in the above algorithm we can usually obtain best Tchebycheff rational approximations with up to 15 coefficients.
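The iteration can be sketched for the smallest interesting case, $f(x) \approx p_0/(1 + q_1 x)$, where the normalization $Q(0) = 1$ leaves two unknowns. The sketch below is ours and rests on stated assumptions: in place of a real program for overdetermined linear equations it uses a brute-force routine (the hypothetical `minimax_2`) which relies on the fact that the minimax solution of $m$ equations in 2 unknowns is the minimax solution of some subsystem of 3 equations, and it assumes no denominator value $1 + q_1 x_i$ vanishes.

```python
import math
from itertools import combinations


def minimax_2(rows, rhs):
    """Minimise max_i |rhs[i] - rows[i][0]*c1 - rows[i][1]*c2| by brute force.

    Every triple of equations is tried; the triple with the largest
    three-point deviation determines the solution.
    """
    best_h, best_c = -1.0, None
    for i, j, k in combinations(range(len(rows)), 3):
        (a1, a2), (b1, b2), (d1, d2) = rows[i], rows[j], rows[k]
        # lam is orthogonal to both columns of the 3-by-2 subsystem.
        lam = (b1 * d2 - d1 * b2, d1 * a2 - a1 * d2, a1 * b2 - b1 * a2)
        if min(abs(t) for t in lam) == 0.0:
            continue  # degenerate triple, ignored in this sketch
        norm = sum(abs(t) for t in lam)
        e = (lam[0] * rhs[i] + lam[1] * rhs[j] + lam[2] * rhs[k]) / norm
        if abs(e) > best_h:
            # On the triple the residuals equal sign(lam_t) * e; two of these
            # equations determine c (their determinant is lam[2]).
            y1 = rhs[i] - math.copysign(1.0, lam[0]) * e
            y2 = rhs[j] - math.copysign(1.0, lam[1]) * e
            best_c = ((y1 * b2 - a2 * y2) / lam[2], (a1 * y2 - b1 * y1) / lam[2])
            best_h = abs(e)
    return best_c


def weighted_minimax(f, xs, steps=10):
    """Fit f(x) ~ p0/(1 + q1*x); return p0, q1 and the true error at each step."""
    w = [1.0] * len(xs)
    errors = []
    for _ in range(steps):
        # w*|f*(1 + q1*x) - p0| = |w*f - (w*p0 - w*f*x*q1)|, linear in (p0, q1).
        rows = [(wi, -wi * f(x) * x) for wi, x in zip(w, xs)]
        rhs = [wi * f(x) for wi, x in zip(w, xs)]
        p0, q1 = minimax_2(rows, rhs)
        errors.append(max(abs(f(x) - p0 / (1.0 + q1 * x)) for x in xs))
        w = [1.0 / abs(1.0 + q1 * x) for x in xs]   # new weights 1/|Q(x_i)|
    return p0, q1, errors


if __name__ == "__main__":
    xs = [i / 10.0 for i in range(11)]
    p0, q1, errors = weighted_minimax(math.exp, xs)
    print(p0, q1, errors)
```

The brute-force inner solver is practical only for a handful of points and unknowns; it stands in for the minimax program that the text presupposes.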


LECTURE V


The next algorithm to be discussed is perhaps not as easily programmed as the Weighted Minimax Algorithm but rests on firmer ground mathematically. Again we are attempting to optimize the coefficients in two polynomials $P$ and $Q$ so that

$$f(x) \approx \frac{P(x)}{Q(x)}$$

on $[a,b]$.

We find it convenient to define $R(c,x) = P(c,x)/Q(c,x)$,

$$P(c,x) = \sum_{i=1}^{n} c_i x^{i-1} , \qquad Q(c,x) = \sum_{i=n+1}^{N} c_i x^{i-n-1} ,$$

and

$$\Delta(c) = \max_{a \le x \le b} | f(x) - R(c,x) | .$$

The letter $c$ stands for the N-tuple $(c_1, \ldots, c_N)$. Since numerator and denominator can be multiplied by the same number without changing the rational function, no loss of generality occurs in restricting $c$ to the cube

$$K = \{\, c \in E_N : \max_i |c_i| \le 1 \,\} .$$


Algorithm. The N-tuple $c^0$ may be arbitrary except that $Q(c^0,x) > 0$ in $[a,b]$. This gets the process started. Now at any stage, with N-tuple $c^\nu$ on hand, define an auxiliary function

$$\delta_\nu(c) = \max_{a \le x \le b} \{\, |f(x) Q(c,x) - P(c,x)| - \Delta(c^\nu) \, Q(c,x) \,\} .$$

Notice that the function remains substantially the same throughout the computation; it changes only in the single coefficient $\Delta(c^\nu)$. Now select $c^{\nu+1}$ to minimize this auxiliary function on the cube $K$. (This is a problem of "convex programming" - minimizing a convex function on a convex set.) If $\delta_\nu(c^{\nu+1}) < 0$, repeat the whole process. If $\delta_\nu(c^{\nu+1}) \ge 0$, stop, for $c^\nu$ was a solution.
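The role of the auxiliary function can be illustrated on a discrete point set. The sketch below is ours and does not carry out the convex-programming step; it only evaluates $\Delta(c)$ and $\delta_\nu(c)$ on a grid and shows that $\delta_\nu(c^\nu) = 0$ while a better coefficient vector gives a negative value. The names `P`, `Q`, `Delta`, `delta_nu` and the sample coefficient vectors are assumptions made for illustration.

```python
import math


def P(c, x, n):
    """Numerator: c_1 + c_2 x + ... + c_n x^(n-1) (c is stored 0-based)."""
    return sum(c[i] * x ** i for i in range(n))


def Q(c, x, n):
    """Denominator: c_(n+1) + c_(n+2) x + ... + c_N x^(N-n-1)."""
    return sum(c[i] * x ** (i - n) for i in range(n, len(c)))


def Delta(c, f, xs, n):
    """Maximum deviation of R(c, x) = P/Q from f over the grid xs."""
    return max(abs(f(x) - P(c, x, n) / Q(c, x, n)) for x in xs)


def delta_nu(c, c_nu, f, xs, n):
    """Auxiliary function max_x { |f Q(c,x) - P(c,x)| - Delta(c_nu) Q(c,x) }."""
    d = Delta(c_nu, f, xs, n)
    return max(abs(f(x) * Q(c, x, n) - P(c, x, n)) - d * Q(c, x, n) for x in xs)


if __name__ == "__main__":
    xs = [i / 20.0 for i in range(21)]          # grid on [0, 1]
    c0 = [1.0, 0.0, 1.0, 0.0]                   # R = 1, a crude start with Q > 0
    c1 = [1.0, 0.5, 1.0, -0.5]                  # (1 + x/2)/(1 - x/2), lies in K
    print(Delta(c0, math.exp, xs, 2), Delta(c1, math.exp, xs, 2))
    print(delta_nu(c0, c0, math.exp, xs, 2), delta_nu(c1, c0, math.exp, xs, 2))
```

Since $\delta_\nu(c)$ is a maximum of functions each convex in $c$, on a finite grid its minimization over the cube can be posed as a linear program; that step is omitted here.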


Theorem. As $\nu \to \infty$, $\Delta(c^\nu) \downarrow \inf_{c \in K} \Delta(c)$.


Proof. A. If $Q(c^{\nu+1}, x_0) \le 0$ ever occurs for some $x_0 \in [a,b]$, let $\nu$ be the first index for which it occurs. Thus $Q(c^{\nu+1}, x_0) \le 0$ but $Q(c^\nu, x) > 0$ for all $x$. We show $c^\nu$ is a solution. If not, then there is a $c'$ such that $\Delta(c') < \Delta(c^\nu)$. We may assume $c' \in K$, and that $Q(c',x) > 0$ in $[a,b]$, for if $Q(c',x)$ vanishes anywhere in $[a,b]$ we can divide a common factor out of numerator and denominator. Thus

$$\delta_\nu(c^{\nu+1}) \le \delta_\nu(c') = \max_x \{\, [\, |f(x) - R(c',x)| - \Delta(c^\nu) \,] \, Q(c',x) \,\} < 0 .$$

But

$$\delta_\nu(c^{\nu+1}) \ge |f(x_0) Q(c^{\nu+1},x_0) - P(c^{\nu+1},x_0)| - \Delta(c^\nu) \, Q(c^{\nu+1},x_0) \ge 0 ,$$

a contradiction.


B. We assume $Q(c^\nu,x) > 0$ always. We shall show that $\delta_\nu(c^{\nu+1}) \le 0$, equality occurring only if $c^\nu$ is a solution. Indeed,

$$\delta_\nu(c^{\nu+1}) \le \delta_\nu(c^\nu) = \max_x \{\, [\, |f(x) - R(c^\nu,x)| - \Delta(c^\nu) \,] \, Q(c^\nu,x) \,\} = 0 .$$

If $c^\nu$ is not a solution then as in part A above, we can show that $\delta_\nu(c^{\nu+1}) < 0$.

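C. $\Delta(c^0) > \Delta(c^1) > \ldots$ . To prove this, write

$$\begin{aligned}
0 > \delta_\nu(c^{\nu+1}) &= \max_x \{\, [\, |f(x) - R(c^{\nu+1},x)| - \Delta(c^\nu) \,] \, Q(c^{\nu+1},x) \,\} \\
&\ge \rho \, [\, \max_x |f(x) - R(c^{\nu+1},x)| - \Delta(c^\nu) \,] = \rho \, [\, \Delta(c^{\nu+1}) - \Delta(c^\nu) \,]
\end{aligned}$$

where

$$\rho = \max_{c \in K} \, \max_{a \le x \le b} Q(c,x) .$$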
D. Put $\Delta^* = \inf_{c \in K} \Delta(c)$. Put $L = \lim_{\nu \to \infty} \Delta(c^\nu)$. The number $L$ is well-defined because it is the limit of a decreasing sequence of non-negative numbers. If $L \ne \Delta^*$, then there is a $c' \in K$ such that $\Delta(c') < L$. We may assume that $Q(c',x) > 0$ on $[a,b]$. Then $|f(x) - R(c',x)| \le \Delta(c') < L \le \Delta(c^\nu)$, so that

$$\delta_\nu(c^{\nu+1}) \le \delta_\nu(c') = \max_x \{\, [\, |f(x) - R(c',x)| - \Delta(c^\nu) \,] \, Q(c',x) \,\} \le \alpha \, [\, \Delta(c') - \Delta(c^\nu) \,]$$

where $\alpha = \min_{a \le x \le b} Q(c',x) > 0$. Thus

$$\Delta(c^{\nu+1}) \le \frac{1}{\rho} \, \delta_\nu(c^{\nu+1}) + \Delta(c^\nu) \le \frac{\alpha}{\rho} \, [\, \Delta(c') - \Delta(c^\nu) \,] + \Delta(c^\nu) .$$

Now in this last inequality, let $\nu \to \infty$. The result is

$$L \le \frac{\alpha}{\rho} \, [\, \Delta(c') - L \,] + L$$

whence $0 \le \Delta(c') - L$, a contradiction. This concludes our proof. This algorithm occurs in E. W. Cheney and H. L. Loeb, "On Rational Chebyshev Approximation", Numerische Mathematik, Summer, 1962. The convex programming problem is discussed, among other places, in E. W. Cheney and A. A. Goldstein, "Newton's Method for Convex Programming and Tchebycheff Approximation", Numerische Mathematik 1 (1959) 253-268.

The last type of rational approximation which we wish to discuss is the so-called Padé approximation. Let $f$ be an analytic function, given by a Taylor series

$$f(x) = \sum_{k=0}^{\infty} a_k x^k .$$

A rational function

$$\frac{P_m(x)}{Q_n(x)} = \frac{\sum_{k=0}^{m} p_k x^k}{\sum_{k=0}^{n} q_k x^k}$$

is a Padé approximation to $f$ if $Q_n f - P_m$ is approximately zero, in the sense that it has a power series in which the first $n + m + 1$ coefficients vanish:

$$Q_n(x) f(x) - P_m(x) = \sum_{k=n+m+1}^{\infty} c_k x^k .$$


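For an example, let us determine a Padé approximation for $e^x$ of the form

$$\frac{p_0 + p_1 x}{q_0 + q_1 x + q_2 x^2} .$$

We must compute $p_0, p_1, q_0, q_1, q_2$ in accordance with

$$(q_0 + q_1 x + q_2 x^2)\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots \right) - (p_0 + p_1 x) = c_4 x^4 + c_5 x^5 + \ldots$$

Equating coefficients of like powers of $x$, we have

$$\begin{aligned}
q_0 - p_0 &= 0 \\
q_0 + q_1 - p_1 &= 0 \\
\tfrac{1}{2} q_0 + q_1 + q_2 &= 0 \\
\tfrac{1}{3!} q_0 + \tfrac{1}{2} q_1 + q_2 &= 0
\end{aligned}$$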
A convenient solution is $q_0 = 6$, $q_1 = -4$, $q_2 = 1$, $p_0 = 6$, $p_1 = 2$, that is,

$$e^x \approx \frac{6 + 2x}{6 - 4x + x^2} .$$

Putting into continued fraction form, we have

$$e^x \approx \cfrac{2}{x - 7 + \cfrac{27}{x + 3}} .$$


The general procedure is as follows.

$$\Big( \sum_{k=0}^{n} q_k x^k \Big) \Big( \sum_{k=0}^{\infty} a_k x^k \Big) - \sum_{k=0}^{m} p_k x^k = \sum_{k=n+m+1}^{\infty} c_k x^k .$$

We multiply the series together and combine:

$$\sum_{k=0}^{\infty} \Big( \sum_{j=0}^{k} a_{k-j} q_j - p_k \Big) x^k = \sum_{k=n+m+1}^{\infty} c_k x^k .$$

Again equate coefficients to obtain the linear equations:

$$\begin{aligned}
\sum_{j=0}^{k} a_{k-j} q_j - p_k &= 0 \qquad (k = 0, 1, \ldots, n + m) \\
p_k &= 0 \qquad \text{when } k > m \\
q_k &= 0 \qquad \text{when } k > n \\
q_0 &= 1 .
\end{aligned}$$

The last equation simply helps us to settle upon a particular solution of the homogeneous equations. Sometimes we are forced to specify some coefficient other than $q_0$.
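The linear equations of the general procedure are easily solved by machine. The following minimal sketch in Python uses exact rational arithmetic; the function name `pade` and the helper `coef` are ours. It assumes the normalization $q_0 = 1$ is admissible and raises an error otherwise, which corresponds to the case where some other coefficient must be specified.

```python
from fractions import Fraction
from math import factorial


def pade(a, m, n):
    """Return (p, q), q[0] = 1, from Taylor coefficients a[0..m+n]."""
    def coef(i):
        return a[i] if i >= 0 else Fraction(0)

    # Rows k = m+1..m+n of  sum_{j=1}^{n} a_{k-j} q_j = -a_k  (since p_k = 0).
    M = [[coef(k - j) for j in range(1, n + 1)] + [-coef(k)]
         for k in range(m + 1, m + n + 1)]
    for col in range(n):                        # Gauss-Jordan elimination
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            raise ValueError("q_0 = 1 is not an admissible normalization here")
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [v - M[r][col] * u for v, u in zip(M[r], M[col])]
    q = [Fraction(1)] + [M[r][n] for r in range(n)]
    # Rows k = 0..m give the numerator: p_k = sum_{j=0}^{min(k,n)} a_{k-j} q_j.
    p = [sum(coef(k - j) * q[j] for j in range(min(k, n) + 1))
         for k in range(m + 1)]
    return p, q


if __name__ == "__main__":
    a = [Fraction(1, factorial(k)) for k in range(4)]   # Taylor series of e^x
    p, q = pade(a, m=1, n=2)
    print(p, q)
```

For $e^x$ with $m = 1$, $n = 2$ this yields $p = (1, \tfrac{1}{3})$ and $q = (1, -\tfrac{2}{3}, \tfrac{1}{6})$, which is the approximation $(6 + 2x)/(6 - 4x + x^2)$ after multiplying numerator and denominator by 6.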


A formula for the coefficients $c_k$ is

$$c_k = \sum_{j=0}^{n} a_{k-j} q_j \qquad (k > m + n) .$$
Since

$$f(x) - \frac{P_m(x)}{Q_n(x)} = \frac{\sum_{k > n+m} c_k x^k}{Q_n(x)}$$

the c's help in assessing the error. For example in the case of $e^x$ with $m = 1$ and $n = 2$ we have for $k > 3$,

$$c_k = a_k q_0 + a_{k-1} q_1 + a_{k-2} q_2 = \frac{6}{k!} - \frac{4}{(k-1)!} + \frac{1}{(k-2)!} = \frac{(k-3)(k-2)}{k!} .$$

Thus $c_4 = \tfrac{1}{12}$, $c_5 = \tfrac{1}{20}$, $c_6 = \tfrac{1}{60}$, etc. The error function is therefore

$$\frac{\sum c_k x^k}{Q_2(x)} = \frac{\tfrac{1}{12} x^4 + \tfrac{1}{20} x^5 + \tfrac{1}{60} x^6 + \ldots}{x^2 - 4x + 6} .$$

When $x$ is near zero, the denominator is near 6, and so the error is of the order of $\tfrac{1}{72} x^4$. Remember that this is obtained from an expression which involves just two divisions. In the general case, taking $n = m$, we would have an error of the order of $x^{2n+1}$ at the expense of $n$ divisions - two times better than the Taylor series.
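The error estimate can be checked numerically. The sketch below is ours (the name `pade_exp` is chosen for illustration); it evaluates the Padé approximation $(6 + 2x)/(6 - 4x + x^2)$ to $e^x$ in the two-division form $2/(x - 7 + 27/(x + 3))$ and compares the actual error with $x^4/72$ for a few small values of $x$.

```python
import math


def pade_exp(x):
    """(6 + 2x)/(6 - 4x + x^2) written with two divisions."""
    return 2.0 / (x - 7.0 + 27.0 / (x + 3.0))


if __name__ == "__main__":
    for x in (0.05, 0.1, 0.2):
        err = math.exp(x) - pade_exp(x)
        # The ratio tends to 1 as x tends to 0.
        print(x, err, x ** 4 / 72.0, err / (x ** 4 / 72.0))
```

The ratio of the actual error to $x^4/72$ exceeds 1 slightly for positive $x$, because the denominator $x^2 - 4x + 6$ is then somewhat less than 6 and the higher terms of the series are positive.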


